Clustering U.S. Law Schools Using Variables That Describe
Size, Cost, Selectivity, and Student Body Characteristics 



INTRODUCTION 



The law school admission process and particularly the role of the LSAT in that process have been studied 
widely for more than 40 years. A common practice among these many studies is to gather data from a 
sample of law schools and then generalize the results from the sampled schools to all of legal education. 
Frequently, the sample is drawn from among those schools that meet criteria for providing an amount of 
data sufficient for analysis (e.g., Wightman & Muller, 1990; Rock and Evans, 1982; Pitcher, 1977; Powers, 
1977). Implicit in these studies is the assumption that law schools are sufficiently similar so that 
legitimate generalizations can be made about the topic of study for all of legal education based on data 
from the schools included in the sample. Despite the fact that the 176 American Bar Association (ABA) 
accredited U.S. law schools have a large number of characteristics in common, not the least of which is 
a virtually identical first year curriculum, there are no data to support complete fungibility among the 176 
schools. A number of current and anticipated research efforts supported or under consideration by the 
Law School Admission Council (LSAC) require greater attention to the legitimacy of generalizing from 
the sample to the total population of law schools. 

This cluster analysis study was undertaken to determine whether a discrete and stable grouping of law 
schools exists when a variety of characteristics of the schools and their students are considered 
simultaneously. The first step was to identify an appropriate and meaningful set of variables on which 
to group or cluster the schools. The next was to select an analytical tool for quantifying similarities and 
differences among the schools. Because the focus of this investigation is evaluating the similarities across
the population of schools, cluster analysis techniques as defined by Johnson (1967) were explored as a
means of partitioning the schools into optimally homogeneous groups on the basis of the selected
empirical measures of similarity among schools. Cluster analysis is an empirical classification
methodology. The entire universe of U.S. ABA accredited law schools potentially was available for
analysis, and cluster analysis proved to be an appropriate methodology. More specifically, sequential
agglomerative hierarchical clustering methods were used to analyze the law school data.

METHOD 

Selecting Variables that Describe Similarities Among Law Schools 

Research designed specifically to describe or evaluate systematic variation among ABA accredited law
schools does not seem to be available, but data about law schools provided by the ABA and by the LSAC 
suggest a series of variables on which law schools may differ in ways that are important to the outcomes 
of many research studies about legal education. Work done in other areas of higher education was 
reviewed for guidance about which of the available data might be most fruitful in defining the dimensions 
that most differentiated participating law schools. A search of the literature revealed some research that 
focuses on undergraduate education (Anderson, 1982 provides a review of these studies) for the purpose 
of defining educational climates that can be used as treatments in a variety of educational effect studies. 
These studies provided some guidance as to which variables might be most useful in grouping together 
law schools with similar educational climates. Additionally, a recent study by Shavelson et al. (1988) 
identified a set of variables that resulted in meaningful clustering of graduate schools of business and 
management.

The review of the literature combined with discussions with individuals knowledgeable about legal
education was used to define relevant dimensions of variability for the law schools included in this
study. The individuals consulted were members of the LSAC Test Development and Research Committee, 
members of the LSAC Minority Affairs Committee, and selected other faculty, deans, or admission 
professionals. The initial work suggested that the dimensions that would define the most important 
similarities and differences among law schools are those summarized in Table 1. 

Table 1

Selected Variables Used to Describe
Law School Characteristics

Variable        Description

Size
  ENR90_FA      Full-time fall ’90 enrollment
  RATIOFS       Faculty student ratio {SUM(ENR90_FA, ENR90PA*2/3)/FAC_FT90}

Diversity
  MF1PCT90      Percent first-year full-time minority students (YR1_FM90/YR1_FT90)
  FF1PCT90      Percent first-year full-time female students (YR1_FF90/YR1_FT90)

Admissions
  ACCPCT90      Percent accepted (Naccepted/Napplicants)*100
  LSAT_Md90     Median LSAT score for full-time students, fall 1990 entering class
  MEDFLGPA      Median UGPA for full-time students, fall 1990 entering class

Cost
  TUI_R_90      Full-time annual tuition and fees (residents) (1990)
  PUBPRV        Status of law school (public/private)

Data Source



For each of the U.S. law schools included in this study, quantifiable data related to admission criteria, size, 
student body diversity, and cost of attending were obtained from the Official Guide to U.S. Law Schools 
(1990-91) (Law School Admission Council/Law School Admission Services, 1990) and from data gathered 
through the American Bar Association 1990 law school survey. 

Selecting Law Schools 



All U.S. ABA accredited law schools were considered for inclusion in this study, but a few schools 
ultimately were excluded. Among the schools that were not included are the three law schools located 
in Puerto Rico. A majority of law school classes at these schools are conducted in Spanish, and Spanish 
is the native language of attending students. Because of the language differences, these schools were not 
included in the cluster analysis. In addition, one law school that enrolls part-time students almost 
exclusively and one school that failed to provide data about the entering credentials of its students in 
response to the 1990 ABA survey were excluded. All cluster analysis procedures were carried out using 
data from the remaining 171 law schools. 



Selecting Variables 



From the initial variable list shown in Table 1, two of the variables were eliminated from the final 
analyses. Based on preliminary analyses, the percentage of first-year female students and status of law 
school (public vs private) were determined to be unusable. There was negligible between-school variance 
for percentage of female students, indicating that the variable would not add any information to the
clustering. Status of law school (either public or private) was correlated very highly with tuition and fees,
so status was a redundant variable: retaining it would add no information to the clustering and would
only give extra weight to the cost factor. The remaining seven variables
were used in all subsequent analyses. The means and standard deviations across the 171 law schools 
included in this study on each of the seven clustering variables are presented in Table 2. 



Table 2

Means and Standard Deviations for Seven Variables
Used to Cluster 171 U.S. ABA Accredited Law Schools

Variable            Mean         STD

LSAT                36.6082      3.9755
GPA                 3.2036       0.2264
TUITION             8179.1579    4808.0857
TOTAL ENROLLMENT    748.3860     375.6588
SELECTIVITY         0.3176       0.1104
PCT MINORITY        0.1606       0.1235
F/S RATIO           23.0291      4.3375



As is apparent from the data presented in Table 2, the variables considered in this study are reported on 
a variety of scales. Most importantly, Table 2 shows sizeable differences among the means and standard
deviations of these variables. Because variables with large variances tend to have more effect on the 
resulting clusters than variables with small variances, the variables were standardized prior to the 
application of any of the cluster analyses. The z-score standardization procedure, setting the mean to zero 
and the standard deviation to one, was used to transform all the clustering variable values prior to applying 
any of the clustering algorithms. Standardized variable scores are reported for the between cluster 
comparisons as an aid to interpreting the distinguishing characteristics among clusters. 
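
The standardization step can be illustrated with a short sketch. The values below are hypothetical and are not data from the study, and the use of the sample standard deviation (n - 1 denominator) is an assumption, since the text does not say which form was used.

```python
from statistics import mean, stdev


def z_standardize(values):
    """Rescale a list of values to mean 0 and standard deviation 1."""
    center = mean(values)
    spread = stdev(values)  # sample standard deviation; an assumption
    return [(v - center) / spread for v in values]


# Hypothetical values for five schools (illustration only).
tuition = [2500.0, 4800.0, 8200.0, 12500.0, 14000.0]
median_gpa = [2.95, 3.10, 3.20, 3.35, 3.60]

z_tuition = z_standardize(tuition)
z_gpa = z_standardize(median_gpa)

# After standardization both variables are on the same scale, so the
# large raw variance of tuition no longer dominates a distance measure.
print([round(z, 3) for z in z_tuition])
print([round(z, 3) for z in z_gpa])
```

Without this step, a distance computed on the raw values would be driven almost entirely by tuition, whose standard deviation in Table 2 is several orders of magnitude larger than that of GPA.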
Clustering Procedures

The term cluster analysis is used to describe a variety of statistical methods designed to create empirical 
groupings of objects. The theoretical properties of the variety of algorithms that fall under this generic 
term are considered in detail in the broad literature on cluster analysis (Anderberg, 1973; Cormack, 1971; 
Everitt, 1980; Lorr, 1983). This study initially considered several of the sequential agglomerative 
hierarchical cluster analysis methods for analysis of the law school data. Hierarchical techniques are most 
appropriate when the primary goal of the study is to discover a taxonomic structure in the set of data, but 
may be less appropriate when used alone if the goal is to form clusters that are highly homogeneous.

Each of the hierarchical methods begins by considering each school as a separate cluster. Each level of 
clustering joins two clusters by selecting from among the clusters those two that are most similar. The 
clustering procedure continues until either a stopping rule is encountered or all of the schools have been 
combined into a single cluster. Some unique properties of the hierarchical clustering procedures of 
particular interest are (1) the clusters are always nonoverlapping, (2) once two schools become members 
of the same cluster, they are never again separated, and (3) with the addition of each new school to the 
cluster, the centroid of the cluster is recalculated. An unfortunate consequence of this latter property is 
that schools already in the cluster could become more distant from the centroid of the parent cluster than 
from the centroid of some other clusters. Thus the subsequent clusters could become increasingly 
heterogeneous. One remedy for this situation (Feild and Schoenfeldt, 1975) is to use a nonhierarchical 
clustering algorithm as a relocation procedure by which a school is reassigned to another cluster if the 
distance to the centroid of that cluster is less than the distance to the centroid of the parent cluster.

A review of the literature was used to select from among the many alternatives the most appropriate
cluster analysis method for partitioning the data set of law schools. Ward’s method followed by a
nonhierarchical k-means procedure was identified as the method of preference, partly on theoretical
grounds and partly as a consequence of a review of empirical and Monte Carlo studies evaluating the
available clustering methods. Both the Ward method and the k-means method merge clusters in a way
that will minimize the increase in the total within group sum of squares. Thus, they are both biased
toward forming spheroidal clusters, the consequence of which is that the resulting clusters tend to be
homogeneous.

The literature consistently supports the use of Ward’s method among the many available hierarchical 
clustering methods. In an extensive review of clustering methods, Milligan and Cooper (1987) noted that 
among the hierarchical clustering procedures, single linkage, average linkage, complete linkage, and 
Ward’s methods are the most commonly tested algorithms. They concluded that Ward’s method tended 
to perform well in all the cases where it was tested for its ability to capture existing clusters in data. The 
performances of the average linkage and complete linkage methods were more erratic, while the single 
linkage method was repeatedly shown to provide poor cluster recovery and to be negatively affected by 
small amounts of error in the data. The most definitive study still is Blashfield’s (1976) Monte Carlo 
comparison of four hierarchical clustering techniques— single linkage, average linkage, complete linkage, 
and Ward’s methods. In that study, Ward’s (1963) method was found to yield the highest accuracy. Prior 
to Blashfield’s study, several empirical studies found some support for the relative superiority of the 
average linkage cluster analysis method (Cunningham & Ogilvie, 1972; Rohlf, 1970; Sneath, 1966; Sokal
& Rohlf, 1962).

In order to address the problem of possible heterogeneity within law school clusters generated by Ward’s
method, a nonhierarchical clustering algorithm similar to MacQueen’s (1967) k-means method was used
to relocate any schools that were closer to the centroid of a different cluster than to the parent cluster to
which the hierarchical method had assigned them. As noted previously, the k-means procedure is like
the Ward procedure in that it seeks to optimize the error sum of squares and is biased toward producing
spheroidal clusters. Forming spheroidal clusters of law schools using a relocation algorithm would ensure
a greater degree of homogeneity among the schools in the cluster, a most desirable outcome if we want
to sample from the cluster and then generalize the findings to schools not selected for the sample.

Validating Cluster Results 

The final step in any cluster analysis study is to validate the results. In Monte Carlo studies, the adequacy 
of the clustering process can be evaluated in terms of the degree to which the method(s) are able to 
recapture the natural structure built into the simulated data. When empirical data are used, as is the case
in this study, there is no prior knowledge of the natural structure in the data, if indeed a natural structure 
exists at all. That is, all of the hierarchical clustering algorithms give solution partitions regardless of 
whether there is any true structure in the data. One way to test or validate the clustering results when 
empirical data have been used in the clustering process is to replicate the clustering process using a variety 
of clustering algorithms. Different clustering methods can, and usually do, produce different results. If 
the cluster structure remains fairly consistent across different clustering methods, it would support the 
conclusion that the clustering identified a real structure within the data, and not simply an artifact of the 
particular clustering method selected. To test the validity of any cluster structure suggested by the Ward
method followed by a k-means relocation adjustment, the results from each of the average
linkage, complete linkage, and single linkage methods were compared with those from Ward’s method. These methods differ
from one another in terms of the criterion used to determine which two clusters to merge at each level.
After the 171 law schools were partitioned into clusters using each of the clustering methods, the results
from the different methods were compared to determine whether the analyses had revealed real structure
in the data. Two methods for relocating schools following the cluster assignments from each of the
hierarchical clustering algorithms were used. First, the hierarchical method was used to assign schools
to the optimal number of clusters. Using the centroids of those clusters as seeds, schools were then
relocated to the nearest seed using SAS’s FASTCLUS procedure. The relocation step was repeated until
the change in the cluster seeds became zero. In the second method, the hierarchical clustering was stopped
short of the optimal number of clusters and the centroids of the clusters formed at that point were used
as starting seeds for the FASTCLUS procedure. The FASTCLUS procedure was then used to further
reduce the data to the optimal number of clusters and to relocate schools assigned to those clusters until
the change in cluster seeds became zero. Use of this hybrid procedure is supported by Milligan and
Cooper’s findings (1985) that the convergent k-means method tended to give the best recovery of cluster
structure.
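
The first relocation method can be sketched in a few lines. This is a minimal illustration in plain Python of nearest-centroid relocation repeated until nothing changes; it is not the FASTCLUS procedure itself, and the function names, the toy points, and the starting assignment (standing in for a hierarchical solution) are all hypothetical.

```python
def centroid(points):
    """Mean vector of a list of equal-length points."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def relocate(points, labels, max_iter=100):
    """Move each point to its nearest cluster centroid until assignments stop changing."""
    labels = list(labels)
    for _ in range(max_iter):
        ids = sorted(set(labels))
        # Seeds are the centroids of the current clusters.
        seeds = {k: centroid([p for p, lab in zip(points, labels) if lab == k])
                 for k in ids}
        new_labels = [min(ids, key=lambda k: sq_dist(p, seeds[k])) for p in points]
        if new_labels == labels:  # no school moved, so the seeds no longer change
            break
        labels = new_labels
    return labels


# Hypothetical standardized scores for five schools on two variables.
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (10.0, 0.0), (11.0, 0.0)]
# Hypothetical hierarchical assignment that leaves the third school
# closer to the centroid of the other cluster than to its own.
start_labels = [0, 0, 1, 1, 1]

relocated = relocate(points, start_labels)
print(relocated)
```

In this toy case the third point starts in cluster 1 but lies much nearer the centroid of cluster 0, so one relocation pass moves it and a second pass confirms that no further moves occur.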

Description of the Four Hierarchical Clustering Methods Used in this Study 

Each of the four clustering methods used in this study is described separately, with the following notation
common across the four descriptions. This notation is consistent with the one used by SAS (SAS/STAT
User’s Guide 1990) because the SAS PROC CLUSTER procedures were used for all hierarchical cluster
analyses reported in this study.

$n$              number of observations (in this study, $n = 171$ law schools)
$x_i$            ith observation
$C_K$            Kth cluster
$N_K$            number of observations in cluster $C_K$
$\bar{x}_K$      mean vector for cluster $C_K$
$\lVert x \rVert$   Euclidean length of the vector $x$
$d(x, y)$        any distance or dissimilarity measure between observations or vectors $x$ and $y$
$D_{KL}$         any distance or dissimilarity measure between clusters $C_K$ and $C_L$

Ward’s Method. Ward (1963) and Ward and Hook (1963) produced a general hierarchical clustering 
method that most often uses the ANOVA sum of squares to determine the clusters that should be merged 
at each stage. The objective is to find the two clusters whose merger results in the minimum increase in 
the total within groups sum of squares. For example, when clusters K and L are merged to form cluster 
M, the increase in the total within group error sum of squares across all clusters is 

$$\Delta E_{KL} = E_M - E_K - E_L$$

where $E_M$ = the error sum of squares for new cluster M (i.e., the sum of squared Euclidean distances from each
data point in cluster M to the mean vector of cluster M),
$E_K$ = the error sum of squares for cluster K, and
$E_L$ = the error sum of squares for cluster L.

The distance between two clusters is defined as

$$D_{KL} = \frac{\lVert \bar{x}_K - \bar{x}_L \rVert^2}{\frac{1}{N_K} + \frac{1}{N_L}}$$

It follows that in Ward’s method the distance between two clusters is the ANOVA sum of squares between 
the two clusters summed over all the variables. The minimum increase in the error sum of squares is 
proportional to the squared Euclidean distance between the centroids of the merged clusters. Ward’s 
method tends to produce clusters with approximately equal numbers of observations, and the method is 
very sensitive to outliers. 
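
The link between the merge cost and the centroid distance can be checked numerically. The sketch below uses two small hypothetical clusters (not schools from the study) and computes the increase in the error sum of squares two ways: directly from the three sums of squares, and from the squared distance between centroids divided by (1/N_K + 1/N_L). The function names are illustrative only.

```python
def mean_vector(points):
    """Mean vector of a list of equal-length points."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]


def error_ss(points):
    """Sum of squared Euclidean distances from each point to the cluster mean."""
    m = mean_vector(points)
    return sum(sum((x - c) ** 2 for x, c in zip(p, m)) for p in points)


def ess_increase(cluster_k, cluster_l):
    """Increase in total within-group sum of squares when K and L are merged."""
    return error_ss(cluster_k + cluster_l) - error_ss(cluster_k) - error_ss(cluster_l)


def ward_distance(cluster_k, cluster_l):
    """Squared centroid distance scaled by the cluster sizes."""
    mk, ml = mean_vector(cluster_k), mean_vector(cluster_l)
    sq = sum((a - b) ** 2 for a, b in zip(mk, ml))
    return sq / (1.0 / len(cluster_k) + 1.0 / len(cluster_l))


# Hypothetical clusters on two standardized variables.
cluster_k = [(0.0, 0.0), (2.0, 0.0)]
cluster_l = [(5.0, 0.0), (5.0, 2.0), (5.0, 4.0)]

direct = ess_increase(cluster_k, cluster_l)   # 34 - 2 - 8
formula = ward_distance(cluster_k, cluster_l)  # 20 / (1/2 + 1/3)
print(direct, formula)
```

Both routes give a value of 24 for these points, which is why Ward’s method can choose the next merger from centroids and cluster sizes alone.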

Average Linkage Method. In average linkage (Sneath and Sokal, 1973), each cluster is characterized by
the average of all links within it. Thus, in this method, the distance between the two clusters is the
average distance between pairs of observations, one in each cluster, such that

$$D_{KL} = \frac{1}{N_K N_L} \sum_{i \in C_K} \sum_{j \in C_L} d(x_i, x_j)$$

As a result, average linkage tends to join clusters with small variances, but the method frequently produces 
results that are little different from those obtained with the complete linkage method. 

Complete Linkage Method. In the complete linkage method, each cluster is characterized by the longest
link needed to connect every member of a cluster to every other member. This method is called complete
linkage because all schools in the cluster are linked to each other at some maximum distance. The
distance between two clusters is defined as

$$D_{KL} = \max_{i \in C_K} \max_{j \in C_L} d(x_i, x_j)$$

where $D_{KL}$ is the distance between the most distant members of clusters K and L. The interpretation
of the clusters formed by complete linkage is in terms of within cluster relationships. Unlike Ward’s 
method or the Average Linkage method, the distance between clusters does not provide particularly useful 
information. Like Ward’s method, the results from the complete linkage method can be seriously affected 
by outliers. 

Single Linkage Method. In the single linkage method, the distance between two clusters is the minimum
distance between an observation in one cluster and an observation in the other cluster. That is, the distance
between two clusters is defined as

$$D_{KL} = \min_{i \in C_K} \min_{j \in C_L} d(x_i, x_j)$$



Despite the fact that the single linkage method has been widely studied and applied, and is both intuitively 
appealing and theoretically attractive, Milligan and Cooper (1987) noted in their methodology review of 
clustering methods that the single linkage method has been shown to give poor cluster recovery and to 
be seriously affected by the presence of even small amounts of error in the data. Based on the research 
to date, this method is expected to correlate least well with the cluster assignments produced by the other 
three hierarchical methods. 
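
The three linkage definitions differ only in how they summarize the same set of between-cluster pair distances. A minimal sketch, assuming Euclidean distance for d and using two hypothetical clusters, makes the contrast concrete; the function names are illustrative.

```python
from itertools import product
from math import dist


def pair_distances(cluster_k, cluster_l):
    """Euclidean distance for every pair with one member in each cluster."""
    return [dist(a, b) for a, b in product(cluster_k, cluster_l)]


def single_linkage(cluster_k, cluster_l):
    """Minimum pair distance between the two clusters."""
    return min(pair_distances(cluster_k, cluster_l))


def complete_linkage(cluster_k, cluster_l):
    """Maximum pair distance between the two clusters."""
    return max(pair_distances(cluster_k, cluster_l))


def average_linkage(cluster_k, cluster_l):
    """Mean of all N_K * N_L pair distances."""
    d = pair_distances(cluster_k, cluster_l)
    return sum(d) / len(d)


# Hypothetical clusters: the four pair distances are 4, 5, 5, and 4.
cluster_k = [(0.0, 0.0), (0.0, 3.0)]
cluster_l = [(4.0, 0.0), (4.0, 3.0)]

print(single_linkage(cluster_k, cluster_l))    # 4.0
print(complete_linkage(cluster_k, cluster_l))  # 5.0
print(average_linkage(cluster_k, cluster_l))   # 4.5
```

Because single linkage looks only at the closest pair, one noisy observation lying between two groups can pull them together, which is consistent with the sensitivity to error noted above.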

Determining the Optimal Number of Clusters 



Hierarchical agglomerative methods do not determine the number of clusters in the data set. Thus, an
external stopping rule must be applied as part of the cluster analysis procedure. The three methods for
suggesting the optimal number of clusters that are available in the SAS cluster analysis program were
examined for the law school clustering.

Cubic Clustering Criterion. The cubic clustering criterion (CCC) (Sarle, 1983) is one of the stopping rule 
statistics available in the SAS package. It is described by Milligan and Cooper (1985) as an index that 
is the product of two terms: the natural logarithm of $(1 - E(R^2))/(1 - R^2)$ and $(np/2)^{.5}/(.001 + E(R^2))^{1.2}$, where
$R^2$ is the proportion of variance accounted for by the clusters and p is an estimate of the dimensionality
of the between cluster variation. The expected value of $R^2$ is determined under the assumption that the
data have been sampled from a uniform distribution based on a hyperbox. In a study that examined thirty
procedures for determining the number of clusters in a data set, Milligan and Cooper found that the cubic
clustering criterion is among the best of the available options. When it was in error, it was more likely
to produce too many than too few clusters. That is, if the index makes an error, it is more likely to result
in an incomplete clustering of the data. Two procedures were incorporated into this study to guard against 
selecting too many clusters for the law school data. First, two alternative stopping rules were applied to 
the same data in order to obtain confirmation of the optimal number of clusters. Additionally, several 
clustering procedures were compared to determine the consistency of the clustering results. 

Pseudo F. The pseudo F statistic measures the separation among all the clusters at the current level. 



Pseudo $t^2$. The pseudo $t^2$ statistic measures the separation between the two clusters most recently joined.
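
The text gives no formulas for these two statistics. The sketch below assumes the definitions commonly documented for hierarchical clustering software: pseudo F as the ratio of between-cluster to within-cluster mean squares over all G clusters, and pseudo $t^2$ as the increase in within-cluster sum of squares from the latest merger divided by the pooled within-cluster variance of the two clusters joined. The clusters and function names are hypothetical.

```python
def error_ss(points):
    """Sum of squared Euclidean distances from each point to the mean vector."""
    n = len(points)
    m = [sum(p[d] for p in points) / n for d in range(len(points[0]))]
    return sum(sum((x - c) ** 2 for x, c in zip(p, m)) for p in points)


def pseudo_f(clusters):
    """Between-cluster mean square over within-cluster mean square (assumed definition)."""
    all_points = [p for c in clusters for p in c]
    n, g = len(all_points), len(clusters)
    total_ss = error_ss(all_points)
    within_ss = sum(error_ss(c) for c in clusters)
    return ((total_ss - within_ss) / (g - 1)) / (within_ss / (n - g))


def pseudo_t2(cluster_k, cluster_l):
    """Separation between the two clusters most recently joined (assumed definition)."""
    w_k, w_l = error_ss(cluster_k), error_ss(cluster_l)
    between = error_ss(cluster_k + cluster_l) - w_k - w_l
    return between / ((w_k + w_l) / (len(cluster_k) + len(cluster_l) - 2))


# Hypothetical clusters on two standardized variables.
cluster_k = [(0.0, 0.0), (2.0, 0.0)]
cluster_l = [(5.0, 0.0), (5.0, 2.0), (5.0, 4.0)]

f_value = pseudo_f([cluster_k, cluster_l])
t2_value = pseudo_t2(cluster_k, cluster_l)
print(round(f_value, 4), round(t2_value, 4))
```

With only two clusters the two statistics coincide, since the separation among all clusters is the separation between the single pair; they diverge once more than two clusters are present.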
Measuring the Similarity of Obtained Clusters across Clustering Methods

The cluster assignments resulting from application of the various clustering methods typically do not agree
perfectly. However, if the assignments reflect a stable underlying taxonomy, rather than an artifact
of the particular method selected, there should be substantial agreement among the methods. There are
several alternatives for evaluating the similarity across methods (e.g., Borko et al., 1968; Green and Rao,
1969; McIntyre & Blashfield, 1980; Rand, 1971). The Borko et al. procedure uses a simple contingency
table to depict the similarity of classifications between two methods. Rand’s c statistic focuses on the joint
membership of pairs of data in the same cluster across methods. Green and Rao’s method is basically
equivalent to the c statistic. McIntyre and Blashfield’s kappa statistic requires matching each cluster
in the solution to one of the populations in the mixture. The matching requirement is problematic because
the way in which the matching is accomplished can severely inflate or underestimate the size of kappa.

Rand’s c statistic was used to evaluate the clustering results for the law school data. The similarity, c,
between two clusterings L and M of the same data is defined as

$$c(L, M) = \frac{N(N-1)/2 - \left\{ \frac{1}{2} \left[ \sum_i \left( \sum_j n_{ij} \right)^2 + \sum_j \left( \sum_i n_{ij} \right)^2 \right] - \sum_i \sum_j n_{ij}^2 \right\}}{N(N-1)/2},$$

where $n_{ij}$ is the number of schools simultaneously in the ith cluster of L and the jth cluster of M, and
N is the total number of schools. The statistic c ranges from 0 to 1.
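
The formula can be evaluated directly from two lists of cluster labels. The sketch below is a plain-Python illustration on six hypothetical objects, not the study's data; the function name and label vectors are made up for the example.

```python
from collections import Counter


def rand_c(labels_l, labels_m):
    """Rand's c: share of object pairs on which two clusterings agree."""
    n = len(labels_l)
    pairs = n * (n - 1) / 2
    n_ij = Counter(zip(labels_l, labels_m))  # contingency table cells
    row_totals = Counter(labels_l)           # sum over j of n_ij
    col_totals = Counter(labels_m)           # sum over i of n_ij
    disagreements = (
        0.5 * (sum(v * v for v in row_totals.values())
               + sum(v * v for v in col_totals.values()))
        - sum(v * v for v in n_ij.values())
    )
    return (pairs - disagreements) / pairs


# Hypothetical assignments of six objects by two clustering methods.
method_l = [0, 0, 0, 1, 1, 1]
method_m = [0, 0, 1, 1, 2, 2]

c_value = rand_c(method_l, method_m)
print(round(c_value, 4))
```

Of the 15 pairs here, the two clusterings agree on 10 (2 pairs placed together by both, 8 placed apart by both), so c is 10/15. Because c counts pairs rather than matching cluster labels, it avoids the matching problem noted for kappa.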
RESULTS



Optimal Number of Clusters 

The results from the several procedures for determining the optimal number of clusters were not perfectly
consistent, either across criteria within a clustering method or across methods using the same criterion.
Statistics for determining the optimal number of clusters resulting from applying each of the three
procedures to each of the four clustering methods (Ward’s method, the average linkage method, the
complete linkage method, and the single linkage method) are shown in Table 3. In order to examine the
presence of an effect from outliers, each of the procedures for determining the optimal number of clusters
was run twice: one time with no trimming and one time with 10 percent trimming. In order to more
easily identify the optimal number of clusters suggested by each procedure, the statistics presented in
Table 3 are plotted against number of clusters for values of one to 20 clusters. Plots of the cubic
clustering criterion by number of clusters for each of four of the clustering analyses are shown in Figures
1a through 1d. Likewise, overlaid plots of the pseudo F and pseudo $t^2$ statistics by number of clusters are
shown in Figures 2a through 2d. Arrows on the plots point to the optimal number of clusters suggested
by each analysis. Neither the CCC nor the pseudo F statistic is included for the single linkage method.
Because the single linkage method tends to chop off tails of distributions, neither of those statistics is
appropriate for it. The pseudo $t^2$ can be used by looking for large values. The optimal number of clusters
is one more than the level of the large $t^2$ statistic.



For Ward’s method with no trimming, the CCC shows no sharp peaks and the pseudo F statistic peaks
at 4 and 5 clusters. The pseudo $t^2$ statistic plummets at 6 and falls even lower at 12 and 19 clusters. For
Ward’s method with 10 percent trimming, the CCC has a peak at 6 clusters and possibly another at 11.
The pseudo F statistic peaks only at 3 clusters, but the pseudo $t^2$ drops at both 6 and 9 clusters. The CCC
has a sharp peak at 7 and 13 for the average linkage method. Consistent with the suggestion of 7 clusters
by average linkage, the pseudo F for that method peaks at 7 and 13 and the pseudo $t^2$ statistic drops at
7 and 13. Notice that the pseudo $t^2$ drops to its lowest point at 15 clusters. For the complete linkage
method, the CCC peaks at 6 and again at 13. The pseudo F statistic peaks only at 6, while the $t^2$ statistic
drops sharply at 6 clusters and falls slightly lower at 7. The $t^2$ statistic reaches approximately the same
value at 13 clusters. For the single linkage method, the largest pseudo $t^2$ statistic occurs at 3 and 15,
suggesting 4 or 16 clusters. The number of law schools is small enough that the value of partitioning into
13 or more clusters is questionable. There is slightly more support for 6 clusters than for 7. All
subsequent analyses specified a 6 cluster solution.

[Table 3. Statistics for Determining the Optimal Number of Clusters]

[Figure panels: Ward's Method, With Trimming; Complete Linkage Method; Pseudo F/t**2 by No. of Clusters]

Final Cluster Assignments 

Law schools were assigned to one of six clusters using one of the following procedures:

1. Law schools were grouped into 12 clusters using each of the Ward’s, average linkage,
   complete linkage, and single linkage methods. These twelve clusters were then relocated
   and fused to form six final clusters using the nonhierarchical centroid method employed
   by the SAS program FASTCLUS.

2. Each of the clustering methods (Ward’s, average linkage, complete linkage, and single
   linkage) was used to create the six clusters. The FASTCLUS procedure was then used
   to relocate the law schools at the same level.

The number of schools assigned to each of the six clusters that resulted from these procedures is shown
in Figure 3. The dendrogram shown in Figure 3 also depicts the level at which the tree structure was
formed for each of the groups.

Figure 3

Dendrogram for Six Law School Clusters
Using Seven Clustering Variables

[Dendrogram: All Law Schools divided into Cluster 4, Cluster 5, Cluster 6, Cluster 1, Cluster 2, and
Cluster 3. Legible cluster sizes, in the order printed: N = 18, N = 8, N = 52, N = 21, N = 53; one size
label is illegible.]



Overlap in Clustering Methods 

The Rand c statistic was then calculated to evaluate how well the different methods converged on a final
clustering solution. The c statistics between each pair of clustering methods are shown in Table 4. The data
suggest substantial though not perfect convergence. The clustering from Ward’s 6-to-6 solution (i.e., a
six cluster solution generated from the Ward’s method, followed by a relocation algorithm that used the
Ward six cluster solution centroids as starting seeds) correlated very highly with the results from the other
solutions. As was anticipated, the single linkage method correlated the least well with the Ward 6-to-6
results. As a further check on the validity of the Ward 6-to-6 clustering results, those cluster assignments
were compared with the results from the Ward’s six cluster hierarchical solution with no subsequent
relocation. The Rand overlap coefficient between Ward’s 6 cluster solution with no further adjustments
and Ward’s 6 cluster solution with relocation is .8711. The Ward’s 6-to-6 solution is used for all
subsequent analyses. The strong overlap between the solutions with and without relocation in addition
to the strong overlap between the 12-to-6 and the 6-to-6 solutions from the other clustering procedures
provides evidence that an appropriate grouping of law schools has been identified.

Table 4

Rand Coefficients for Alternative Clustering Methods
Each Followed by Nonhierarchical Relocation

             Ward 12   Ward 6    Ave 12    Ave 6     Compl 12  Compl 6   Sing 12   Sing 6    Centr 12  Centr 6

Ward 12                .8686     .8777     .8514     .7729     .9319     .8438     .8712     .8576     .8065
Ward 6                           .9301     .9045     .8270     .8993     .7475     .7931     .8607     .8429
Ave 12                                     .8811     .7915     .9103     .7600     .8163     .8416     .8096
Ave 6                                                .8211     .9030     .7545     .8017     .8363     .8001
Compl 12                                                       .7838     .6934     .7011     .8147     .8424
Compl 6                                                                  .8008     .8700     .8491     .7961
Singl 12                                                                           .8863     .7822     .7340
Singl 6                                                                                      .7417     .8010
Centroid 12                                                                                            .8954
Centroid 6



Canonical Discriminant Analysis of Law School Clusters 



In order to obtain a graphical display of the relationships within and among the clusters, a canonical 
discriminant analysis of the law school clusters was conducted. That is, for the seven clustering variables 
and the six cluster groups, a discriminant analysis was carried out using the canonical correlation approach. 
The canonical correlations and the standardized canonical coefficients for the seven law school clustering 
variables are shown in Tables 5 and 6. The R² between the first canonical variable and the cluster variable 
is .8027, which is slightly higher than the corresponding R² for the second canonical variable, .6809, 
suggesting that the first canonical variable has slightly more discriminating power than the second. The plot 
of the first two canonical variables (Figure 4) shows the discriminating power for both canonical variables. 
The data in Figure 4 demonstrate that clusters 2 and 6 are the most widely separated by the first and second 
canonical variables. Cluster 6 schools have the lowest values on the first canonical function and the highest 
on the second. Because the R² for the third canonical variable is almost as large as the R² for the second 
canonical variable, the third variable was plotted against the first canonical variable and is shown in Figure 
5. The standardized means on each of the clustering variables for each cluster aid in understanding the 
relative positions of the six clusters seen in Figure 4 and Figure 5. These means are discussed in more 
detail in the next section of this report. 



Table 5 

Canonical Correlations Between the Sets of Seven Law School 
Clustering Variables and Six Law School Clusters 

                      Adjusted       Approx         Squared
       Canonical      Canonical      Standard       Canonical
       Correlation    Correlation    Error          Correlation    Eigenvalue
1      0.895915       0.887359       0.015135       0.802663       4.0675
2      0.825188       0.811926       0.024471       0.680936       2.1342
3      0.778320       0.776023       0.030235       0.605782       1.5367
4      0.670198       0.670032       0.042247       0.449166       0.8154
5      0.378857       0.373286       0.065688       0.143533       0.1676



Table 6 

Total-Sample Standardized Canonical Coefficients 
for Seven Law School Clustering Variables 

Variable            CAN1            CAN2            CAN3            CAN4            CAN5
LSAT         0.419182876    -0.214128179    -0.937669174    -0.260321032    -0.229521517
GPA          0.228077224     0.293860145    -0.048892147    -0.132765087     1.316843530
TUITION      1.080712191     0.293860145     1.169464143    -0.853444366     0.247582160
TOTENR       0.411340396    -0.082980128     0.55057676      1.346156756     0.146842638
SELECT      -0.411206166    -0.947463867     0.354536951    -0.006142111     1.057183410
PCTMIN      -0.989956539     1.226197447     0.511452388     0.112799907     0.457025577
FSRATIO      0.368843162     0.094883049     0.172380460     0.205143215    -0.257461233
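
For readers who want to see where the quantities in Table 5 come from, the following is a rough sketch of canonical discriminant analysis using NumPy. It assumes the usual formulation in which the eigenvalues are those of W⁻¹B (W is the pooled within-cluster sums of squares and cross-products matrix, B the between-cluster one) and each squared canonical correlation equals λ/(1+λ). The function name and the synthetic data are hypothetical; this is not the procedure output reported in the study. The last lines check that the published Table 5 values satisfy eigenvalue = R²/(1−R²).

```python
import numpy as np


def canonical_correlations(X, labels):
    """Return (canonical correlations, eigenvalues of W^-1 B), largest first."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    groups = np.unique(labels)
    p = X.shape[1]
    grand_mean = X.mean(axis=0)
    W = np.zeros((p, p))  # pooled within-group SSCP matrix
    B = np.zeros((p, p))  # between-group SSCP matrix
    for g in groups:
        Xg = X[labels == g]
        centered = Xg - Xg.mean(axis=0)
        W += centered.T @ centered
        shift = (Xg.mean(axis=0) - grand_mean)[:, None]
        B += len(Xg) * (shift @ shift.T)
    eigvals = np.linalg.eigvals(np.linalg.solve(W, B)).real
    k = min(p, len(groups) - 1)  # number of canonical variables
    eigvals = np.sort(np.clip(eigvals, 0.0, None))[::-1][:k]
    return np.sqrt(eigvals / (1.0 + eigvals)), eigvals


# Synthetic stand-in data: 6 groups of 30 observations on 7 variables.
rng = np.random.default_rng(0)
labels = np.repeat(np.arange(6), 30)
X = rng.normal(size=(180, 7)) + labels[:, None] * rng.normal(size=(1, 7))
can_corr, eigenvalues = canonical_correlations(X, labels)
print(can_corr)

# Consistency check on the published Table 5 values.
table5_r2 = np.array([0.802663, 0.680936, 0.605782, 0.449166, 0.143533])
table5_eig = np.array([4.0675, 2.1342, 1.5367, 0.8154, 0.1676])
implied_eig = table5_r2 / (1.0 - table5_r2)
print(np.round(implied_eig, 4))
```

With seven variables and six clusters there are at most five canonical variables, which matches the five rows of Table 5.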



Figure 4 

Plot of Canonical Variables Identified by Cluster Analysis Using Ward’s Method 
Followed by Relocation 

Plot of CAN2*CAN1. Symbol is value of CLUSTER. 



Figure 5 

Plot of Canonical Variables Identified by Cluster Analysis Using Ward’s Method 
Followed by Relocation 

Plot of CAN3*CAN1. Symbol is value of CLUSTER. 



Comparisons Among Clusters 

As an aid to describing the similarities and differences among the six clusters of law schools, a 
multivariate analysis of variance was carried out using the seven variables included in the vectors used 
to form the clusters. Not surprisingly, the differences among the clusters are highly significant. Individual 
ANOVAs and post hoc comparisons using the Tukey-Kramer method for unequal sample sizes (Tukey, 
1953; Kramer, 1956) were used to identify differences between the clusters on specific variables. 
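
As an illustration of the univariate follow-up step, the sketch below computes a one-way ANOVA F statistic and the Tukey-Kramer studentized range statistic for each pair of groups when group sizes differ. It uses plain NumPy and toy numbers, not the study data, and the function names are hypothetical. Judging significance would additionally require a critical value from the studentized range distribution, which is left out here.

```python
from itertools import combinations

import numpy as np


def one_way_anova(groups):
    """Return (F statistic, within-group mean square, within-group df)."""
    groups = [np.asarray(g, dtype=float) for g in groups]
    n_total = sum(len(g) for g in groups)
    grand_mean = np.concatenate(groups).mean()
    ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
    ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    ms_within = ss_within / df_within
    return (ss_between / df_between) / ms_within, ms_within, df_within


def tukey_kramer_q(groups, ms_within):
    """Studentized range statistic for every pair of groups (unequal n allowed)."""
    q = {}
    for i, j in combinations(range(len(groups)), 2):
        gi = np.asarray(groups[i], dtype=float)
        gj = np.asarray(groups[j], dtype=float)
        # Kramer's adjustment: average the reciprocal sample sizes.
        se = np.sqrt(ms_within / 2.0 * (1.0 / len(gi) + 1.0 / len(gj)))
        q[(i, j)] = abs(gi.mean() - gj.mean()) / se
    return q


# Toy samples standing in for one variable measured in three clusters.
samples = [[1, 2, 3], [2, 3, 4, 5], [6, 7, 8]]
f_stat, ms_within, df_within = one_way_anova(samples)
q_stats = tukey_kramer_q(samples, ms_within)
print(round(f_stat, 3), {pair: round(q, 3) for pair, q in q_stats.items()})
```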

The results from these analyses are shown in Table 7 and Table 8. The means for each of the seven 
variables are shown for each cluster in Table 7. Comparisons of the differences between standardized 
means among the clusters are shown in Table 8. All the variables are standardized to mean 0, standard 
deviation 1. 
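
The standardization and the per-cluster means of the kind reported in Table 7 can be sketched as follows. This assumes z-scores computed across all schools with the sample standard deviation (the report does not say which divisor was used), and the raw values and function name are made up for illustration.

```python
import numpy as np


def standardized_cluster_means(X, clusters, ddof=1):
    """Standardize each column to mean 0, SD 1, then average within clusters."""
    X = np.asarray(X, dtype=float)
    clusters = np.asarray(clusters)
    z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=ddof)
    means = {int(c): z[clusters == c].mean(axis=0) for c in np.unique(clusters)}
    return means, z


# Hypothetical raw data: columns are mean LSAT score and tuition in dollars.
raw = np.array([[36.0, 4000.0], [38.0, 5000.0], [30.0, 9000.0], [32.0, 10000.0]])
clusters = np.array([1, 1, 2, 2])
means, z = standardized_cluster_means(raw, clusters)
for cluster, row in means.items():
    print(cluster, np.round(row, 4))
```

Standardizing first puts variables measured in very different units, such as test scores and dollars, on a common scale before clusters are compared.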



Table 7 

Standardized Means for the Seven Clustering Variables 
by Cluster 

Variable    Cluster 1   Cluster 2   Cluster 3   Cluster 4   Cluster 5   Cluster 6
LSAT           0.2630     -1.0873     -0.2764      0.7340      1.3702     -1.8508
GPA            0.3597     -0.6911     -0.5002      0.6091      1.2914     -1.5176
TUITION       -0.9771     -0.4237      0.6759      0.6187      1.1399     -1.0487
TOTENR        -0.3776     -0.6184      0.1312      1.9121     -0.1180     -1.0668
SELECT        -0.3215      1.6871      0.2474     -0.4885     -1.3021      0.1123
PCTMIN        -0.0997     -0.6600     -0.3591      0.2552      0.3281      3.4154
FSRATIO       -0.4346     -0.3213      0.3921      1.1782     -0.2284     -1.2133



Table 8 

Comparison of Differences of Standardized Means Among Clusters 



The data in Table 7 can be used to describe the similarities among the schools in each cluster. The data 
in Table 8 can be used to interpret the importance of the observed differences among and between clusters. 



Cluster 6 includes schools with the largest proportion of minority students. The average percentage of 
minority students for schools in cluster 6 is significantly larger than the average percentage found among 
schools in any other cluster. These schools also have the lowest tuition, the smallest enrollments and the 
lowest faculty student ratios. Both the undergraduate grade point averages and the LSAT scores of 
students attending schools in cluster 6 are the lowest among any of the clusters. The comparison of 
differences shown in Table 8 reveals that the mean LSAT score for cluster 6 is significantly lower 
(alpha=.05) than the mean for each of the other clusters, while the mean UGPA is significantly lower than 
the mean for each cluster except cluster 2. 

Cluster 4 includes the schools with the largest enrollment and the highest faculty student ratio. The means 
for each of these variables are significantly different from the means for each of the other clusters. The 
entering credentials of students attending cluster 4 schools are the second highest among the six clusters, 
although they are significantly lower than those of students attending cluster 5 schools and not 
significantly different from students attending cluster 1 schools. Cluster 4 schools are among the most 
highly selective, although the proportion selected is not significantly less than the proportion selected by 
cluster 1 and cluster 6 schools. The schools in cluster 1 have student bodies with admission credentials 
(i.e., LSAT scores and UGPAs) that are almost identical to cluster 4 schools. Cluster 1 and cluster 4 
schools also have approximately the same percentage of accepted students. The differences between 
schools in cluster 1 and schools in cluster 4 primarily are size and cost, with cluster 1 schools being both 
significantly smaller and significantly less costly. Schools in both clusters have average and nearly 
identical proportions of minority students. 



Cluster 2 includes the smallest of the law schools, yet the distinguishing feature of these schools is that 
they have the largest percentage of acceptances. That is, the proportion accepted is significantly larger 
than at the schools in any of the other clusters. Cluster 2 schools, which are the lowest cost schools, have 
students with very low entering credentials (LSAT scores and UGPAs) and they enroll the smallest 
proportion of minority students among all of the clusters. 

The schools in cluster 5 are the most expensive. Additionally, they enroll students with LSAT scores and 
UGPAs that are significantly higher than those found at each of the other clusters, and accept the smallest 
proportion of applicants. Cluster 5 schools also enroll the second highest proportion of minority students, 
although the percentage is significantly less than the percentage found at cluster 6 schools. 

Cluster 1 and cluster 3 include the largest number of schools — 52 and 53, respectively. The schools in 
both clusters are about average in size, in the LSAT scores and UGPAs of their entering students, and in 
the percentage of students accepted. Even so, cluster 3 schools are significantly larger, significantly more 
expensive, and accept a significantly higher percentage of their applicants than cluster 1 schools. Cluster 
1 schools are among the least expensive. Additionally, the entering credentials of students attending 
cluster 3 schools are significantly lower than those attending cluster 1 schools. The faculty student ratio 
at cluster 3 schools also is significantly higher than at schools in cluster 1. The ethnic diversity at schools 
in cluster 1 and cluster 3 is approximately the same, with the percentage of minority students being 
slightly though not significantly smaller at cluster 3 schools. 

Comparisons of Nearest Centroid Law Schools 

To further describe each cluster, as well as to help evaluate the homogeneity of the obtained clusters, the 
law school closest to the centroid of each cluster was identified. The scores on each of the seven variables 
for each of the nearest centroid law schools are presented in Table 9. The law school most typical of 
cluster 1 attracts a fairly able student body, one that presents both LSAT scores and undergraduate grade 
point averages more than half a standard deviation above the mean. Distinguishing characteristics of this 
school are its low tuition and its small faculty student ratio. Both of these factors and its relatively small 
size are likely related to the fact that it is among the most selective of the schools. Despite its low tuition 
and fees, this school has a very small percentage of minority students. In contrast to this cluster 1 school, 
the school most typical of cluster 4 has a student body virtually identical in terms of its entering 
credentials, but both size and costs that are substantially higher. Despite its increased cost, this school 
has far fewer faculty per student and considerably more minority students. 
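
Identifying the school nearest each cluster centroid can be sketched as below. This assumes Euclidean distance on the standardized variables, which the report does not state explicitly for this step, and the data and function name are hypothetical.

```python
import numpy as np


def nearest_centroid_members(Z, clusters):
    """Map each cluster to the row index of the member closest to its centroid."""
    Z = np.asarray(Z, dtype=float)
    clusters = np.asarray(clusters)
    nearest = {}
    for c in np.unique(clusters):
        members = np.flatnonzero(clusters == c)   # row indices in this cluster
        centroid = Z[members].mean(axis=0)        # mean vector of the cluster
        distances = np.linalg.norm(Z[members] - centroid, axis=1)
        nearest[int(c)] = int(members[np.argmin(distances)])
    return nearest


# Hypothetical standardized scores for six schools on two variables.
Z = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0],
              [5.0, 5.0], [5.0, 6.0], [9.0, 9.0]])
clusters = np.array([1, 1, 1, 2, 2, 2])
nearest = nearest_centroid_members(Z, clusters)
print(nearest)
```

The school returned for a cluster need not coincide with the centroid itself; how far it sits from the cluster means is one informal indication of how homogeneous the cluster is.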



Table 9 

Standardized Scores for Cluster Centroid Schools on Law School Variables 

Variable    Cluster 1   Cluster 2   Cluster 3   Cluster 4   Cluster 5   Cluster 6
LSAT           0.6016     -1.1592     -0.4045      0.6016      1.3563     -2.1653
GPA            0.5580     -0.4577     -0.8551      0.4255      1.3087     -1.2526
TUITION       -1.0085     -0.8164      0.5243      1.0686      1.6058     -1.4892
TOTENR        -0.2672     -0.7570     -0.1687      1.5616     -0.3790     -1.1430
SELECT        -0.4313      1.3808      0.4747     -0.0689     -1.1561      0.3841
PCTMIN        -0.2476     -0.1666     -0.2476      0.4000      0.6429      2.9904
FSRATIO       -1.0765     -0.6592      0.7656      1.6118     -0.1358     -0.4909

The school at the centroid of cluster 2 is a small school with relatively low tuition and a student body with 
relatively low entering credentials. Its most distinguishing characteristic among the seven variables is that 
it is the least selective among the six centroid schools, although the mean selectivity for cluster 2 confirms 
that this school is not the least selective among the schools that comprise that cluster. The cluster 3 
centroid school also has a student body with relatively low LSAT scores and UGPAs. The cluster 3 
centroid school differs from its cluster 2 counterpart in that its tuition is considerably higher, its size is 
larger, and the proportion accepted is lower. The proportion of minority students is virtually identical 
among the cluster 1, cluster 2, and cluster 3 centroid schools. 



The school nearest the centroid of cluster 4 is the largest of the nearest centroid law schools, although it 
is not quite as large as the mean for cluster 4. Thus it is not the largest of the law schools. It is a high 
tuition school and it has a student body with fairly strong entering credentials. This school also has the 
largest faculty student ratio among the nearest centroid schools. This ratio also is nearly half a standard 
deviation larger than the mean for its cluster. The cluster 5 centroid school has an even more 
academically able student body, as measured by LSAT score and UGPA, and it has higher tuition, but 
this school is considerably smaller than the cluster 4 nearest centroid school. It is the most 
selective among the nearest centroid schools, but not the most selective school in cluster 5. This school 
also reports one of the largest percentages of minority students among schools not in cluster 6. 

Consistent with the description of cluster 6, the school nearest the cluster 6 centroid is distinguished by 
the percentage of minority students it enrolls. It also is the smallest of the nearest centroid schools and 
has the lowest tuition. Both its size and tuition also are slightly smaller than the mean for cluster 6 
schools. The mean LSAT scores and UGPAs for students at this school are the lowest among the centroid 
schools. 



Comparison of the standardized scores on the seven clustering variables for the nearest centroid schools 
with the standardized means for the corresponding cluster based on all schools in the cluster confirms that 
the clusters are fairly homogeneous and that the nearest centroid school provides a good description of the 
cluster. This comparison also highlights the importance of considering the statistical significance of 
differences as shown in Table 8 when comparing the cluster characteristics. For example, although the 
cluster means for LSAT scores and UGPAs are lower for cluster 1 schools than for cluster 4 schools, 
Table 8 suggests that these differences are not statistically significant, and Table 9 shows that the LSAT 
scores and UGPAs for the cluster 1 and cluster 4 nearest centroid schools are virtually identical. 
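
The comparison between nearest centroid schools and cluster means can be reproduced directly from the published values in Table 7 and Table 9. The sketch below subtracts the cluster means from the nearest centroid scores. The summary measure used here, the root mean square difference across the seven variables, is an illustrative choice and not one used in the report.

```python
import numpy as np

variables = ["LSAT", "GPA", "TUITION", "TOTENR", "SELECT", "PCTMIN", "FSRATIO"]

# Rows follow `variables`; columns are clusters 1 through 6 (values from Table 7).
table7_means = np.array([
    [0.2630, -1.0873, -0.2764, 0.7340, 1.3702, -1.8508],
    [0.3597, -0.6911, -0.5002, 0.6091, 1.2914, -1.5176],
    [-0.9771, -0.4237, 0.6759, 0.6187, 1.1399, -1.0487],
    [-0.3776, -0.6184, 0.1312, 1.9121, -0.1180, -1.0668],
    [-0.3215, 1.6871, 0.2474, -0.4885, -1.3021, 0.1123],
    [-0.0997, -0.6600, -0.3591, 0.2552, 0.3281, 3.4154],
    [-0.4346, -0.3213, 0.3921, 1.1782, -0.2284, -1.2133],
])

# Same layout, values from Table 9 (nearest centroid schools).
table9_centroid_schools = np.array([
    [0.6016, -1.1592, -0.4045, 0.6016, 1.3563, -2.1653],
    [0.5580, -0.4577, -0.8551, 0.4255, 1.3087, -1.2526],
    [-1.0085, -0.8164, 0.5243, 1.0686, 1.6058, -1.4892],
    [-0.2672, -0.7570, -0.1687, 1.5616, -0.3790, -1.1430],
    [-0.4313, 1.3808, 0.4747, -0.0689, -1.1561, 0.3841],
    [-0.2476, -0.1666, -0.2476, 0.4000, 0.6429, 2.9904],
    [-1.0765, -0.6592, 0.7656, 1.6118, -0.1358, -0.4909],
])

# Positive entries mean the nearest centroid school scores above its cluster mean.
difference = table9_centroid_schools - table7_means
rms_by_cluster = np.sqrt((difference ** 2).mean(axis=0))

for cluster, rms in enumerate(rms_by_cluster, start=1):
    print(f"cluster {cluster}: RMS difference = {rms:.3f}")
```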

Summary and Conclusions 

Cluster analysis methods were used to identify similarities among U.S. ABA accredited law schools. The 
analyses undertaken in this study strongly support the presence of six clusters of law schools when 
variables describing size, cost, selectivity, and student body characteristics are used to group together those 
schools that are the most similar to one another. The validity of the cluster assignments resulting from 
this study is confirmed by the strong overlap in assignments produced by each of the clustering methods. 



The majority of schools (105 of 171 schools studied) fall into one of two clusters (cluster 1 or cluster 3), 
both of which tend to represent average scores on most of the clustering variables. Even so, the two major 
clusters differ significantly from each other on every clustering variable except percentage of minority 
students. Distinguishing characteristics for each of the other clusters are a consequence of one or more 
clustering variable scores that are considerably higher or lower than average law school means. For 
example, cluster 2 can be interpreted to represent the smallest schools with the highest proportion of 
accepted students, despite very low LSAT scores and UGPAs among their entering classes. Cluster 4 
represents the largest of the schools. These schools are highly selective and enroll an academically able 
student body, although they are surpassed on both of these variable scores by cluster 5 schools. In 
addition to being the most selective and enrolling the most academically able student body, cluster 5 
schools are the most expensive. Cluster 6 schools are distinguished by the large proportion of minority 
students they enroll as well as by their low cost and small size. 

The results from this study confirm that law schools are not fungible in terms of several important 
variables that characterize their academic climates. Research studies that wish to generalize their findings 
to all of legal education will enhance their ability to do so by sampling from each of the six clusters. 
Alternatively, research studies that are designed to focus on certain characteristics of the law school 
environment might best be served by sampling schools from one or more clusters that best represent the 
characteristics of interest. 
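
A study following the first recommendation might draw schools from every cluster. The following is a minimal sketch, assuming a mapping from hypothetical school identifiers to cluster numbers and an equal number of schools drawn per cluster; allocation proportional to cluster size would be an equally reasonable design. The function name and identifiers are invented for illustration.

```python
import random
from collections import defaultdict


def sample_per_cluster(assignments, per_cluster, seed=None):
    """Draw up to `per_cluster` schools at random from each cluster."""
    rng = random.Random(seed)
    by_cluster = defaultdict(list)
    for school, cluster in assignments.items():
        by_cluster[cluster].append(school)
    sample = {}
    for cluster in sorted(by_cluster):
        schools = sorted(by_cluster[cluster])    # fixed order for reproducibility
        k = min(per_cluster, len(schools))       # small clusters contribute all members
        sample[cluster] = rng.sample(schools, k)
    return sample


# Hypothetical cluster assignments for 30 schools spread over six clusters.
assignments = {f"school_{i:03d}": (i % 6) + 1 for i in range(30)}
sample = sample_per_cluster(assignments, per_cluster=2, seed=1)
for cluster, schools in sample.items():
    print(cluster, schools)
```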